The detection of complexity in behavioral outcomes often requires an estimation of their variability
over a prolonged time spectrum to assess processes of stability and transformation. Conventional 
scholarship typically relies on time-independent measures, 'snapshots', to analyze those outcomes, 
assuming that group means and their associated standard deviations, computed across 
individuals, are sufficient to characterize the educational outcomes that inform policy, and that 
time does not matter in this context. In its statistically abstract form, the assumption that you can 
rely on snapshots is referred to as the ergodic assumption. This paper argues that ergodicity 
cannot be taken for granted in educational data. The first section discusses artificially generated 
time series trajectories to illustrate ergodicity (white noise) and three types of non-ergodicity: 
short-term correlations between observations, long-term correlations (pink noise) and infinite 
correlations (Brownian motion). A second section presents daily attendance data observed in two 
urban high schools over a seven-year period to show that these data are non-ergodic and suggest
complexity. These findings offer a counter-example to the efficacy of using snapshots to measure 
educational outcomes. 

Most research taking place from a complexity vantage point is concerned with the 
processes that constitute stability and transformation in systems, and how these 
processes depend on circumstances that are external to the system of interest. In the 
wake of the advent of chaos theory as a major subject of study in the dynamical 
literature, processes of spontaneous transformation within systems have received 
considerable attention as well. One can argue that the dynamics of stability and change 
are an almost intrinsic part of the educational process (Jorg, 2011; Koopmans, 2014; 
Stamovlasis & Koopmans, 2014; Vygotsky, 1978), and that this process is therefore a
particularly good case in point to study complexity processes. The field of education has
shown high responsiveness to the substantive aspects of dynamical thinking, such as the 
use of a complexity paradigm to analyze educational politics and policy (Osberg & 
Biesta, 2010), but there are also important methodological implications that come with 
the study of stability and change in educational systems that have not yet been fully 
appreciated, in spite of comprehensive discussions of the implications of complexity for 
qualitative research (Bloom, 2011; Bloom & Volk, 2007) as well as quantitative research 
(Gilstrap, 2013). This paper focuses on the latter perspective, which invites us to consider 
one implication which has not received its due recognition in the educational research 
literature, namely the study of how behavioral outcomes fluctuate in the course of time. 
How stable are they? How does change show itself in a trajectory of successive 
outcomes? 

As in many other disciplines, conventional scholarship in education tends to capture 
phenomena 'frozen in time'. Means get computed to characterize the central tendency in 
group outcomes, and variances and standard deviations are derived to quantify the 
extent to which individual observations vary from the means in their group. We call 
these individual discrepancies measurement errors. Thus, sample estimates such as the
mean, in conjunction with these measurement errors, are used to characterize outcomes
or characteristics of the population of interest, such as, for instance, their achievement in 
reading or math toward the end of a given school year. The efficiency that comes with 
the analysis and description of educational phenomena in terms of the behavior of a 
large number of individuals comes at the expense of the particularities of the individual 
case, which are the focus of this paper. 

When we describe a population based on snapshots from a sample, we are required 
to assume that the estimates of central tendency and variability are generalizable across 
the entire time-spectrum of interest (e.g., an entire school year), which is to say that we 
assume stability in the individual outcomes over time, and that group means and 
variances calculated irrespective of time differences are therefore sufficient to 
characterize the distribution of outcomes. In its more abstract form, this assumption is 
known in the literature as the ergodic assumption, which states that the distributional 
features that are found if behavior is measured across a large number of individuals 
carry over to those that would be found if the behavior of one given individual is 
measured across a large number of repeated measurement occasions. In other words, it 
is assumed that a within-subjects distribution of observations measured longitudinally 
would be equivalent to a cross-sectional between-subjects distribution when 
representing the possible states of a system. 
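
The ergodic assumption can be made concrete with a small simulation. The sketch below is an illustration under stated assumptions, not part of the analyses reported in this paper: it generates a number of hypothetical cases, computes each case's average over its own measurement occasions, and reports how much those within-case averages differ from one case to the next. The function names, the number of cases and occasions, and the choice of a random walk as the non-ergodic contrast are all assumptions made for this example.

```python
import random
import statistics


def white_noise(n_occasions, rng):
    """Independent draws around a fixed mean of zero (an ergodic process)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n_occasions)]


def random_walk(n_occasions, rng):
    """Cumulative sum of independent draws (a non-ergodic process)."""
    level, path = 0.0, []
    for _ in range(n_occasions):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def spread_of_time_averages(process, n_cases=200, n_occasions=1000, seed=1):
    """Standard deviation, across cases, of each case's own time average."""
    rng = random.Random(seed)
    averages = [statistics.fmean(process(n_occasions, rng)) for _ in range(n_cases)]
    return statistics.pstdev(averages)


wn_spread = spread_of_time_averages(white_noise)
rw_spread = spread_of_time_averages(random_walk)
print(f"white noise: spread of time averages = {wn_spread:.3f}")
print(f"random walk: spread of time averages = {rw_spread:.3f}")
```

For white noise, every case's time average settles near the same value, so a cross-sectional mean describes each individual trajectory reasonably well. For the random walk, the time averages differ widely between cases, so no single cross-sectional summary can stand in for the individual trajectory.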

In psychology, the ergodic assumption has been famously questioned by Molenaar 
(2004) who argues that merely because we often find random variability in cross- 
sectional measurements, we cannot assume that behavioral variability over time 
displays a random pattern as well. Therefore, 'snapshots' of behavioral outcomes are of
limited value unless the ergodic assumption is actually confirmed. Since most research 
in education is of the cross-sectional kind, the question whether a distribution of 
longitudinal observations in fact resembles the distribution of cross-sectional ones is a
pertinent empirical question. Verification of the ergodic assumption, then, requires the
collection of data over a very large number of measurement occasions, the kind of data, 
in other words, whose collection is extremely rare both in psychology and in education. 

A related point is that conventional scholarship in education prefers to study 
exogenous causal processes rather than endogenous ones, which is to say that, for 
example, the impact of intervention x on behavior y is examined, without taking a 
detailed look at behavior y at previous occasions to assess y's impact on itself at a later 
date (van Geert, 2009). The focus on exogenous processes across individuals 
presupposes that hypothesized causal mechanisms that are confirmed based on group 
means are in alignment with the causal mechanism governing the behavior of each of 
the individual cases within that group (Gu, Preacher, & Ferrer, 2014). Verification of this 
assumption is of practical value because it qualifies the conclusions that are drawn from 
the statistical association between group means and it draws attention to the 
particularities of the individual case. This, in turn, can be instructive in our attempts to 
understand the relationship between mediators and outcomes, particularly as it 
concerns the transformative processes underlying this relationship (Koopmans, 2014). 
The study of the endogenous process is therefore a legitimate scholarly pursuit in its 
own right that has critical information to add to the cross-sectional comparisons that 
form the basis of most current applied research in education, which is relatively 
unconcerned with the description of baseline conditions in the system. 

There is a growing interest in the field for a more detailed estimation of how time- 
dependent dynamics affect the behavior of systems. While the traditional repeated 
outcome measures (pretest-posttest) and growth modeling designs (Bryk & 
Raudenbush, 1992; Rogosa, 2002) or short time series designs (Bloom, 1999) do concern 
themselves with outcomes in relation to the passage of time, these methods do not 
typically generate a sufficient number of measurement occasions for a fine-grained 
estimation of the time-dependent processes that is needed to elicit the underlying 
dynamics of stability and change (see also Rogosa, Floden, & Willett, 1984).

The study of behavioral outcomes measured repeatedly over time permits 
verification of the assumption that the distribution of measurement errors across the 
time spectrum is random and, if it is not, an assessment of whether the errors bear the marks of
complexity. A critical step in the determination whether outcomes reveal complexity is
to examine whether measurement errors are correlated with themselves at different 
points in time (autocorrelation) and if they are, how long these autocorrelations persist 
over time. This long-term persistence is characteristic of complexity, and it suggests a 
pattern of adaptive responsiveness of systems to external influences that is not 
necessarily predictable (Stadnitski, 2012). Conventional time series analysis (Box & 
Jenkins, 1970) is specifically designed to estimate short range dependencies in a string of 
time-ordered observations. The estimation of persistence of these dependencies over the 
longer term is a relatively recent development in the modeling of dynamical processes 
(Beran, 1994; Granger & Joyeux, 1980; Hosking, 1981). Since complexity typically 
manifests itself as a long-term process (Bak, 1996; Waldrop, 1992), the investigation of 
patterns of variability over time within individuals is an effective method to determine
the extent to which they are indicative of complexity. Appendix A offers a brief synopsis
of time series analysis and its uses. 

Since the analysis of complex time dependencies in education is extremely rare at 
this point, this article aims to describe the type of data that would permit us to study 
complex processes, and illustrate how such data may indicate complex processes. This 
paper consists of two parts. The first part describes four distinct error scenarios that 
typically occur in successively ordered data, and it uses simulated data to illustrate the 
kind of dependencies in measurement errors that produce non-ergodic data, and data 
showing evidence of complexity. A second section will describe the trajectory of daily 
high school attendance ratings over a seven-year period in two urban high schools to
demonstrate the relevance of ergodicity as a methodological construct to real 
educational data, and to illustrate what complexity looks like in daily attendance data. 
The implication of finding complexity in daily attendance data over time is that
attendance patterns are unpredictable and indicate the capacity of the system to adapt to
changing circumstances (Stadnitski, 2012).

Illustrating the Idea of Ergodicity in Simulated Data 

In the section that follows, I will show four patterns of variability that are
typically distinguished in the dynamical literature to illustrate what ergodic and non- 
ergodic time series data might look like if we assume randomness in the cross-sectional 
distribution of errors. These simulated trajectories provide the background for the 
discussion of some real school attendance data presented afterwards. 

Four trajectories were generated with 1,500 sequentially ordered data points to 
illustrate four distinct error scenarios that are well-known in the dynamical literature 
(see Appendix B for further details): 

• Random white noise: a distribution over time in which all observed measurements 
are randomly distributed around the average of the series. In this case, errors are 
uncorrelated, consistent with the ergodic assumption based on random cross- 
sectional data; 

• Short-term autoregression: a pattern where individual observations in a time series 
tend to cluster with observations nearby in the series. In this instance, errors are 
correlated with closely neighboring ones; 

• Pink noise: a pattern of unpredictable cycles in the series with errors being 
correlated over the longer term of the trajectory, and 

• Brownian motion: errors are correlated over the entire time spectrum. 
Observations in close vicinity cluster tightly together, while over the course of 
the entire time spectrum, the series shows volatility. 

The first and second scenarios characterize a linear process, whereas the third and fourth scenarios
are generally seen as being indicative of complexity, as they include stable as well as
unstable non-random patterns of behavior. Furthermore, assuming a cross-sectional
random distribution, the ergodic assumption would produce the series illustrating
random white noise, whereas the other three scenarios would be considered non-ergodic
under that assumption.
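
The four error scenarios can be sketched in a few lines of code. The example below is a minimal illustration and not necessarily the procedure described in Appendix B. It uses the parameter values given in the figure captions (an autoregressive coefficient of 0.7 and a fractional differencing parameter d of .35), treats Brownian motion as the cumulative sum of white noise, and approximates the ARFIMA (0, d, 0) process with a truncated moving-average representation. The function names, the truncation at 500 weights, and the random seed are assumptions made for this example.

```python
import random

N = 1500  # number of sequentially ordered data points, as in the text


def simulate_white_noise(n, rng):
    """Independent normal errors: no memory."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def simulate_ar1(n, rng, phi=0.7):
    """Short memory: x[t] = phi * x[t-1] + e[t]."""
    series, previous = [], 0.0
    for _ in range(n):
        previous = phi * previous + rng.gauss(0.0, 1.0)
        series.append(previous)
    return series


def simulate_fractional_noise(n, rng, d=0.35, n_weights=500):
    """Long memory: ARFIMA(0, d, 0) as a truncated sum of weighted past shocks."""
    psi = [1.0]
    for k in range(1, n_weights):
        psi.append(psi[-1] * (k - 1 + d) / k)  # slowly decaying weights
    shocks = [rng.gauss(0.0, 1.0) for _ in range(n + n_weights)]
    return [
        sum(psi[k] * shocks[t + n_weights - k] for k in range(n_weights))
        for t in range(n)
    ]


def simulate_brownian_motion(n, rng):
    """Infinite memory: every shock stays in the level of the series."""
    level, series = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        series.append(level)
    return series


def lag1_autocorrelation(series):
    """Correlation between each observation and its immediate predecessor."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    return sum(dev[t] * dev[t + 1] for t in range(n - 1)) / sum(v * v for v in dev)


rng = random.Random(2016)
scenarios = {
    "white noise": simulate_white_noise(N, rng),
    "autoregression": simulate_ar1(N, rng),
    "pink noise": simulate_fractional_noise(N, rng),
    "Brownian motion": simulate_brownian_motion(N, rng),
}
for name, series in scenarios.items():
    print(f"{name:>16}: lag-1 autocorrelation = {lag1_autocorrelation(series):.2f}")
```

The lag-1 autocorrelation is only a first summary. What separates the scenarios is how slowly the dependence fades as the lag increases.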


[Figure 1 panels: White Noise; Autoregression; Pink Noise; Brownian Motion]

Figure 1. Frequency distribution of four simulated time series: a. White noise, b. Autoregression,
c. Pink noise, d. Brownian motion

To further illustrate the idea of ergodicity, I will first show what the distributions of 
these four simulated data patterns look like when treated as if they are cross-sectional, 
i.e., a set of independent observations frozen in time. The histograms in Figure 1 show 
that with the exception of Brownian motion, these simulations appear to be very similar 
to each other. In spite of the qualitative differences in their time-dependent 
characteristics, they yield decent approximations of textbook univariate normal 
distributions (as they were designed to do). 
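
The point that a histogram is blind to temporal order can be checked directly. The sketch below is an illustration under stated assumptions, not a reproduction of the simulations in this paper: it generates an autoregressive series, shuffles the order of its observations, and compares the two versions. The coefficient of 0.7 matches the Figure 3 caption, while the seed and the helper name are hypothetical.

```python
import random


def lag1_autocorrelation(series):
    """Correlation between each observation and its immediate predecessor."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    return sum(dev[t] * dev[t + 1] for t in range(n - 1)) / sum(v * v for v in dev)


rng = random.Random(7)

# Time-ordered series with short-term dependence between neighbors.
ordered, previous = [], 0.0
for _ in range(1500):
    previous = 0.7 * previous + rng.gauss(0.0, 1.0)
    ordered.append(previous)

# Same values, but with the temporal order destroyed.
shuffled = ordered[:]
rng.shuffle(shuffled)

# Identical sets of values imply identical histograms, means and variances.
same_histogram = sorted(ordered) == sorted(shuffled)
print(f"same frequency distribution: {same_histogram}")
print(f"lag-1 autocorrelation, ordered:  {lag1_autocorrelation(ordered):.2f}")
print(f"lag-1 autocorrelation, shuffled: {lag1_autocorrelation(shuffled):.2f}")
```

Both versions yield exactly the same frequency distribution, yet only the ordered series carries the dependency between neighboring observations. A snapshot analysis cannot tell the two apart.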

In the section that follows, the time dependent aspects of the simulated data are 
presented in three ways: the original trajectories are shown first. These plots offer an
initial impression of how an outcome of interest might behave over time and whether
the temporal pattern suggests stability, instability and/or transformation. Next, 
Autocorrelation Function (ACF) plots are shown. The ACF plot is a diagnostic device 
that is used to visualize the correlations between observations by plotting those
correlations against the number of intervals, or lags, over which they are computed. ACF plots
are generated over 30 lags of the series to show the autocorrelation patterns between 
observations within a narrow time frame, and then, an ACF plot is generated over the 
total number of lags in the series to illustrate the long-term dependency patterns that 
would be indicative of complexity.¹ The spikes in the ACF plots indicate the magnitude
of the autocorrelations at each of the given lags, and the dotted lines indicate the 
confidence intervals constructed around these correlations (Cryer & Chan, 2008). The 
illustrations below show how each of the four error scenarios typically manifests itself in 
these plots. 
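
The quantities displayed in an ACF plot can be computed as follows. This is a minimal sketch rather than the software used for the figures in this paper. It assumes the common large-sample bound of plus or minus 1.96 divided by the square root of the series length for the dotted lines, which is the bound expected if the series were purely random; the function names and the seed are hypothetical.

```python
import math
import random


def autocorrelations(series, max_lag):
    """Sample autocorrelations at lags 1..max_lag of a time-ordered series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denominator = sum(v * v for v in dev)
    return [
        sum(dev[t] * dev[t + lag] for t in range(n - lag)) / denominator
        for lag in range(1, max_lag + 1)
    ]


def white_noise_bound(n, z=1.96):
    """Approximate 95% bound for autocorrelations of a purely random series."""
    return z / math.sqrt(n)


rng = random.Random(30)
series = [rng.gauss(0.0, 1.0) for _ in range(1500)]  # white noise

short_acf = autocorrelations(series, max_lag=30)
bound = white_noise_bound(len(series))
outside = [lag for lag, r in enumerate(short_acf, start=1) if abs(r) > bound]

print(f"95% bound: +/-{bound:.4f}")
print(f"lags with spikes outside the bound: {outside}")
```

Calling `autocorrelations(series, max_lag=len(series) - 1)` gives the long-term version of the plot. With a 95 percent bound, roughly one in twenty spikes of a purely random series is expected to fall outside the dotted lines by chance alone.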

The section that follows provides a more detailed discussion of the four 
aforementioned patterns as well as a simulated example of each. These simulations were 
specifically designed to produce the patterns described. 

Random white noise: Statisticians try to model as much of the variation in their data 
as possible, and seek to ensure that remaining measurement error is random and 
normally distributed (Gaussian white noise). Knowing past observations does not 
improve one's prediction of new outcomes in case of white noise, i.e., there is no memory 
in the series. If cross-sectional error patterns are random and the ergodic assumption 
holds, the (unmeasured) noise pattern over time would look like the simulated trajectory 
shown in Figure 2a. Knowing the trajectory does not improve our ability to predict 
future observations, which is to say that the error pattern does not add information to 
what measures of central tendency and variability are telling us about the distribution of
outcomes over time. Modeling time does not help us in this case. The lack of pattern in
the autocorrelation functions in Figures 2b and 2c further illustrates this randomness.
Barring a few incidental exceptions, none of the spikes is outside of the 95% confidence
interval indicated by the dotted lines. There does not seem to be any discernible pattern
across lags, either in the short-term or the long-term version of the plot.

Short-term autoregression is said to occur when predictive accuracy in a time series 
can be improved by considering the values of immediately preceding observations, i.e., 
there is reliance on the short memory in the trajectory. Autoregression is one of the more 
frequently observed types of non-randomness in time-dependent data. Figure
3a shows an example. The non-randomness can be seen in the somewhat clustered 
pattern of the observations suggesting relatedness between nearby observations. This 
pattern is more clearly discernible in the autocorrelation function at 30 lags (Figure 3b), 
which shows spikes at the first five lags indicating statistically significant correlations of 
given observations with their nearby neighbors. The autocorrelation function at 1,500
lags, the full extent of the trajectory (Figure 3c), illustrates another aspect of short-term
autoregression, namely that autocorrelation quickly recedes to non-significance as the
number of lags increases.

¹ The choice of the number of lags for the short-term ACF plot is arbitrary and primarily meant to
ensure that all short-range dependencies are visualized. Twenty-five or 30 are typical
choices for the number of lags plotted. The long-term plots simply include all lags in the
series.

Pink noise characterizes long memory in a trajectory, i.e., when given observations are 
correlated with observations from a long time ago, or a long time hence. Dynamical 
scholars (e.g., Beran, 1994; Stadnitski, 2012; Wagenmakers, Farrell, & Ratcliff, 2004) 
interpret the occurrence of such long memory processes as evidence of complexity, 
fractality, and self-organized criticality in the series, processes that are difficult to detect 
by eye-balling the data but that affect the predictability of the series. Figure 4a shows a 
trajectory displaying pink noise. Significant vacillation around the mean of the series can 
be observed over the longer term of the series. The autocorrelation function at 30 lags 
(Figure 4b) illustrates the strong dependency between observations: autocorrelations 
recede very slowly as the number of lags increases, but ultimately, they do reduce to a 
non-significant random pattern, as shown in Figure 4c. 
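
The difference between short and long memory can also be seen in the theoretical autocorrelations of the two models, which do not depend on any particular simulated sample. The sketch below assumes the standard recursion for the autocorrelation function of an ARFIMA (0, d, 0) process (Hosking, 1981), in which the autocorrelation at lag k equals the autocorrelation at lag k - 1 multiplied by (k - 1 + d) / (k - d), and compares it with the geometric decay of a first-order autoregressive process. The parameter values are those of the Figure 3 and Figure 4 captions; the function names are hypothetical.

```python
def arfima_0d0_acf(d, max_lag):
    """Theoretical autocorrelations of ARFIMA(0, d, 0) at lags 1..max_lag."""
    rho, values = 1.0, []
    for lag in range(1, max_lag + 1):
        rho *= (lag - 1 + d) / (lag - d)  # slow, hyperbolic decay
        values.append(rho)
    return values


def ar1_acf(phi, max_lag):
    """Theoretical autocorrelations of AR(1) at lags 1..max_lag."""
    return [phi ** lag for lag in range(1, max_lag + 1)]  # fast, geometric decay


long_memory = arfima_0d0_acf(d=0.35, max_lag=100)
short_memory = ar1_acf(phi=0.7, max_lag=100)

for lag in (1, 5, 30, 100):
    print(
        f"lag {lag:>3}: ARFIMA(0, 0.35, 0) = {long_memory[lag - 1]:.3f}, "
        f"AR(1) = {short_memory[lag - 1]:.5f}"
    )
```

The autoregressive correlations are negligible well before lag 30, whereas the long-memory correlations are still clearly positive at lag 100. Autocorrelations estimated from a finite series, such as those plotted in Figure 4, will generally differ from these theoretical values.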

Brownian motion is an error pattern named after the botanist Robert Brown (1773-1858),
who, in microscopic observations, noted persistent fluctuations of particles
found in pollen grains in water (Feder, 1988). Figure 5a shows a typical example of 
Brownian motion. The tight clustering of immediately neighboring observations coupled 
with unstable long-term behavior is illustrative of this error pattern. Because of the 
volatile appearance of the trajectory, traditional measures of central tendency do not 
characterize the distribution of results very well, as can be seen in Figure 5a in the 
continuous deviations of the trajectory from its mean of zero. The autocorrelation 
function at 30 lags illustrates why Brownian motion is also referred to in the literature as 
an infinite memory process (Stadnitski, 2012): within the 30-lag time frame (Figure 5b) 
there is no clearly discernible recession of the autocorrelations to non-significance 
indicating a high dependence of the value of any given observation in the trajectory on 
many of its neighboring values, including those not nearby. Figure 5c shows as well that 
while the autocorrelations ultimately do recede to non-significance as the number of lags 
increases, small but very persistent dependencies remain, as can be seen in the highly 
predictable response of the autocorrelations to variations in the lag size. The systematic 
nature of this process is taken to indicate that there is 'hidden determinism' in the data 
requiring further study, as was the case in Brown's original study. 
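
One way to see why Brownian motion counts as an infinite memory process is to model it as a random walk, i.e., the cumulative sum of random shocks, which is a standard discrete-time representation. In the sketch below, which is an illustration under that assumption rather than the simulation behind Figure 5, the level of the series is almost perfectly correlated with its previous value, while the changes from one observation to the next are uncorrelated. The seed and the helper name are hypothetical.

```python
import random


def lag1_autocorrelation(series):
    """Correlation between each observation and its immediate predecessor."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    return sum(dev[t] * dev[t + 1] for t in range(n - 1)) / sum(v * v for v in dev)


rng = random.Random(1827)

# Random walk: each observation is the previous one plus a random shock,
# so every past shock remains part of the current level.
level, walk = 0.0, []
for _ in range(1500):
    level += rng.gauss(0.0, 1.0)
    walk.append(level)

# First differences recover the individual shocks.
changes = [walk[t] - walk[t - 1] for t in range(1, len(walk))]

print(f"lag-1 autocorrelation of the levels:  {lag1_autocorrelation(walk):.3f}")
print(f"lag-1 autocorrelation of the changes: {lag1_autocorrelation(changes):.3f}")
```

The tight clustering of neighboring observations comes from the accumulated past, and the long-term volatility arises because nothing pulls the series back toward a fixed mean.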

[Figure 2 plot: White Noise - No Memory]

Figure 2. When the time series displays a random pattern, it has no predictive value. a. The
original trajectory (top panel); b. the autocorrelation function (ACF) plot at 30 lags (middle
panel); c. the ACF plot over the entire series (bottom panel). N = 1,500.

[Figure 3 plot: Autoregression - Short Memory; horizontal axis: Time]

Figure 3. Autoregression in an ARFIMA (1, 0, 0) model with φ = 0.7. a. The original trajectory (top
panel); b. the autocorrelation function (ACF) plot at 30 lags (middle panel); c. the ACF plot over
the entire series (bottom panel). N = 1,500.

[Figure 4 plot: Pink Noise - Long Memory; horizontal axis: Time]

Figure 4. Pink noise in a simulated ARFIMA (0, d, 0) model with d = .35. a. The original trajectory
(top panel); b. the autocorrelation function (ACF) plot at 30 lags (middle panel); c. the ACF plot
over the entire series (bottom panel). N = 1,500.

[Figure 5 plot: Brownian Motion - Infinite Memory]

Figure 5. Infinite memory in a simulated ARFIMA (0, d, 0) model at d = 1. a. The original
trajectory (top panel); b. the autocorrelation function (ACF) plot at 30 lags (middle panel); c. the
ACF plot over the entire series (bottom panel). N = 1,500.


Daily Attendance Rates in Two High Schools 

In the section that follows, I will illustrate how the statistical principles outlined above 
apply to the analysis of real time series data in education. Daily school attendance is 
typically reported in terms of averages over a weekly, monthly or yearly period (Kemple
& Snipes, 2000; National Center for Education Statistics, 2008), or in terms of school
building absenteeism rates (Balfanz & Byrnes, 2012). Aggregated across the time 
dimension, these measures contain little information about the extent to which 
attendance rates fluctuate in the course of a given school year, how stable they are, or 
whether they taper off toward the end of the year, let alone whether the trajectory shows 
evidence of complexity. It is precisely these types of variability that complex dynamical 
scholarship takes an interest in. 

Real data were collected from the New York City Department of Education, which, 
ever since the 2004-2005 school year has provided weekly updates on the daily 
attendance rates for each of its schools. Two trajectories were selected for further 
description: a large high school (School 1, n = 1,230 students) whose attendance 
trajectory is relatively stable, and a small one (School 2, n = 360 students) whose 
trajectory is much less stable. Both schools predominantly serve students from minority 
groups, with 50 percent of students listed as eligible for free/reduced-price lunch in
School 1, and 100 percent in School 2. After data cleaning, the attendance trajectories in
School 1 and 2 consisted of 1,335 and 1,345 observations, respectively. It is important to 
realize that these data provide school level information based on the attendance of 
successive cohorts of students. The analysis of attendance trajectories of individual 
students falls outside of the scope of this paper.

As in the simulations above, the daily attendance data are shown in three ways for both
schools: the time series containing the original observations, an ACF plot for the short-term
dependencies, and an ACF plot for the entire series. Figure 6a shows the daily
attendance rates in School 1 over the seven-year period; Figure 7a does so for School 2.
These two trajectories show considerable non-random variability and periodic patterns 
in both schools, although these trends seem to be more pronounced in School 2 than in 
School 1. It can also be seen in the figures that, contrary to the artificial data described 
above, a fair number of extreme observations can be seen in both schools (instances of 
low attendance due to inclement weather, upcoming holidays and vacations, etc.). While 
the variability on normal school days is much lower in School 1, these low points are 
more clearly discernible in School 2, suggesting more turbulence. The ACF plots over 30 
lags (Figures 6b and 7b) indicate that particularly in School 2, a strong cyclical pattern 
can be seen at the fifth lag, which is equal to a weekly cycle (Mondays being correlated 
with the previous Monday, etc.). In School 1, this trend is also present, but less 
pronounced. The ACF plots for both schools indicate the presence of short-term 
dependencies at lags 1 and 2, and also at lags 3 and 4 in School 1. In School 2, these latter 
dependencies may have gotten absorbed by the prominence of the lag 5 correlation. The 
long-term ACF plots shown in Figures 6c and 7c also show that neither of the two 
schools is free of long-term error dependency, as many spikes cross the 95 percent 
confidence boundary. However, it is also clear that a more prolonged pattern of 
dependency is apparent in School 2 than in School 1. Statistical estimation confirms that 
the pattern for School 2 is pink noise, whereas it contains only short-term autoregression 
for School 1 (see Appendix B and Koopmans, 2015). 
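
The weekly cycle at the fifth lag can be illustrated with synthetic data. The sketch below does not use the New York City attendance data. It assumes a hypothetical series of daily attendance percentages for 1,300 consecutive school days, in which each day of the five-day school week has its own fixed deviation from a base rate of 90 percent plus random noise. All values, names, and the seed are invented for this example.

```python
import random

# Hypothetical Monday-to-Friday deviations from the base rate, in percentage points.
WEEKDAY_EFFECT = [-1.5, 0.5, 1.0, 0.5, -0.5]


def autocorrelations(series, max_lag):
    """Sample autocorrelations at lags 1..max_lag of a time-ordered series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denominator = sum(v * v for v in dev)
    return [
        sum(dev[t] * dev[t + lag] for t in range(n - lag)) / denominator
        for lag in range(1, max_lag + 1)
    ]


rng = random.Random(5)
attendance = [
    90.0 + WEEKDAY_EFFECT[day % 5] + rng.gauss(0.0, 1.0)  # synthetic daily rate
    for day in range(1300)
]

acf = autocorrelations(attendance, max_lag=10)
for lag, r in enumerate(acf, start=1):
    print(f"lag {lag:>2}: {r:+.2f}")
```

Because school days five lags apart fall on the same weekday, the autocorrelations at lags 5 and 10 stand out from those at the lags in between, which is the signature of a weekly cycle in an ACF plot.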

[Figure 6 panels: Daily Attendance Rates: School 1; Autocorrelation Function (Entire Series): School 1]

Figure 6. Daily attendance rates and autocorrelation functions for School 1. a. The original
trajectory (top panel); b. the autocorrelation function (ACF) plot at 30 lags (middle panel); c. the
ACF plot over the entire series (bottom panel). N = 1,335. Reproduced from Koopmans (2015) with
permission from the Society for Chaos Theory in Psychology and Life Sciences.

[Figure 7 panel: Daily Attendance: School 2]

Figure 7. Daily attendance rates and autocorrelation functions for School 2. a. The original
trajectory (top panel); b. the autocorrelation function (ACF) plot at 30 lags (middle panel); c. the
ACF plot over the entire series (bottom panel). N = 1,345. Reproduced from Koopmans (2015) with
permission from the Society for Chaos Theory in Psychology and Life Sciences.



Discussion 

Complexity is a time-sensitive process, and understanding this sensitivity requires a 
quantitative approach to the analysis of the passage of time in the observed data. This 
paper illustrates an approach where time is analyzed as a source of variability in daily 
attendance outcomes to determine whether that variability is indicative of stability or of
complex patterns. The availability of daily attendance ratings over long periods of time
in some urban school districts permits this kind of assessment, allowing us, first, to determine
whether daily high school attendance shows evidence of complexity, and second, to
establish whether the ergodic assumption holds in this case. The study presented in this 
paper also compares the findings of the attendance study against simulations of typical 
patterns of linearity (white noise, short-term autoregression), and complexity (pink
noise, Brownian motion).

With the exception of the white noise scenario presented in Figure 1 above, all data 
discussed in this paper show pronounced time dependencies (correlated errors), calling 
into question the assumption that if observations are repeated, the distribution of 
within-subject errors will be random in the same way as we would expect data to be in 
the cross-sectional case (ergodicity). The case of daily high school attendance illustrates 
the importance of investigating time dependencies in data usually reported as time- 
independent averages, e.g., over a weekly, monthly or yearly period, assuming 
implicitly that a mean or regression slope effectively characterizes attendance rates over 
the entire time spectrum, and that there is random variability around the mean without 
any period-dependent fluctuations. The results for the two schools presented here 
challenge this convention, and they show that these traditional summary measures 
conceal a time dependency that is actually very interesting, making the case for a 
dynamical approach. 

Summarizing the daily attendance rates in School 2 in terms of traditional measures 
of central tendency and variability would also conceal the evidence for complexity. The 
signs for it show up in the original series shown in Figure 7a, which, in addition to
short-term fluctuations, has a slowly undulating pattern over the longer term that
resembles the pink noise pattern shown in the simulation in Figure 4a. In addition, the
ACF plots in Figures 7b and 7c show significant associations between observations not
only over short but also over longer stretches of time. The slow recession of
autocorrelations into non-significance is illustrative of pink noise, and generally taken as
an indicator of complexity. This pattern suggests adaptability of a system (the
school serving its students in this case) to changing circumstances while at the same
time maintaining a relatively stable appearance overall (Stadnitski, 2012). In School 1, on
the other hand, the features of the trajectory are relatively constant over the entire
trajectory, with much less variability in daily attendance rates and a trajectory that goes 
fairly straight over the seven-year period. This pattern suggests less vulnerability of 
attendance rates to external events and circumstances in this school. 

Comparison of the four simulated trajectories illustrates the gradual nature of the 
difference in the visual appearance of the error scenarios, from the apparent random 
variability between individual observations that are stable around the mean in the white 
noise pattern to the very tightly clustered observations in the wildly fluctuating pattern 
characterizing Brownian motion. Autoregression and the pink noise both fall 
somewhere between these two scenarios with greater fluctuation in pink noise than in 
the autoregression.

What does it tell us about daily high school attendance that the attendance trajectory
in one of the two schools resembles pink noise? Beran (1994) suggests that pink noise 
trajectories indicate a complex pattern of influences between endogenous processes 
(here, attendance rates over time) and exogenous ones. In this context, examples of 
exogenous processes would be parental support, or teacher and school leadership 
effectiveness (Astone & McLanahan, 1991; Balfanz & Byrnes, 2012; Roby, 2003). In that 
way, pink noise reflects healthy adaptive behavior in the system of interest (Stadnitski, 
2012). Pink noise is also seen as indicative of a systemic tension-release pattern known as 
self-organized criticality. The instructive examples of self-organized criticality in the 
literature are the sand piles and piles of rice that are built by gradually pouring more 
sand or rice over them (Bak, 1996; Jensen, 1998). At given critical points, the tension 
between the grains in the piles thus created requires an adjustment to reduce this 
tension, which is manifest in occasional avalanches in the pile. A similar tension-release 
pattern may be hypothesized for School 2, where the continuous demand for 
participation in school activities creates the need for occasional releases, thus creating 
the somewhat undulating pattern shown in Figure 7a. A similar pattern is seen in School 
1, but it is less pronounced and it fails to reach statistical significance in the estimation 
process. 

Pink noise is also seen as evidence for fractality (Stadnitski, 2012; Wagenmakers, 
Farrell, & Ratcliff, 2004). Within the time series analytical framework, fractality is 
defined in terms of the fluctuations that appear in the series independent of scale, giving 
rise to a pattern of self-similar 'humps within bumps' (Kaplan & Glass, 1995). By 
definition, the timeframes within which these bumps occur are not clearly demarcated in 
a fractal pattern, bringing an element of unpredictability to daily attendance patterns, 
which otherwise tend to display fairly predictable weekly cycles (Koopmans, 2011; 
2015). 

The analyses presented here concern individual cases rather than sampling groups, 
and as such, the analyses illustrate both the strengths and limitations of single case 
designs. The highly differentiated nature of the data requires a large number of 
observations over the time spectrum without the information loss that results from the 
aggregation of observations across cases. On the other hand, these results tell us very 
little about the specific circumstances under which we can expect complexity to occur in 
daily attendance trajectories. Future work needs to give us a better understanding of how 
the behavior of daily attendance over time relates to exogenous processes such as the 
ones mentioned above, and to variables that are malleable through policy, such as 
school and class size. Such work should triangulate the evidence of the presence 
or absence of complexity in daily attendance rates with other school characteristics, such 
as school demographics, school climate, parental involvement and school building 
leadership. Another approach to obtaining a better understanding of the impact of 
exogenous processes on daily high school attendance is a mediation analysis in which 
daily attendance would be conceptualized as an outcome trajectory, which is linked to a 
trajectory describing an exogenous process, allowing a direct attribution of shifts in the 
behavior of one to the behavior of the other. More detailed discussions of this approach 
can be found in Gu, Preacher and Ferrer (2014), Molenaar et al. (2009), Vallacher and 
Novak (2009), and Wong, Vallacher and Novak (2014). 

In traditional cross-sectional designs, many statistical approaches to the data require 
us to assume that the errors introduced by the individual cases are randomly 
distributed around the mean of their group. Observing a 
corresponding distribution across the time spectrum would allow us to conclude that 
the data are ergodic. The prolonged 'stickiness' shown in the pink noise figures above 
serves as a confirmation of complex patterns. With or without triangulation, the 
patterns of variability described in this paper have their own intrinsic interest, in that 
they show that the particularities of the passage of time, and the expression of 
complexity, are only discernible if very long stretches of time are considered for analysis. 

Much information is lost when it is taken for granted that there is no need to repeat 
observations for individual cases, because the fluctuations you would find around the 
mean of the observations would be random anyway. The analysis presented here 
illustrates why knowing temporal variability in attendance is relevant to scholars who 
are interested in how the educational process evolves as time passes, as well as to school 
practitioners who can use this information to inform the scheduling of the school day and the 
school year, and make policy adjustments based on the rates at which their students 
show up on any given day. 

Appendix A: Time Series Analysis in a Nutshell 

Time series analysis is a technique that is used to describe strings of temporally ordered 
observations statistically. Time series can be found in many daily newspapers, where they 
report trends in the weather (temperature, precipitation) and economic indicators 
(mortgage rates, interest rates). In such cases, time series can provide information about 
whether there is an upward or downward trend in the measurements, whether there are 
cyclical fluctuations (e.g. weather seasons) or random fluctuations or unpredictability in 
the series. Time series analysis is not only a descriptive method; it can also be used for 
forecasting if it can be reasonably assumed that the trends observed in the series will 
persist into the future. The approach is used, for instance, to predict variability in stock 
prices. 

In analyses such as those presented in this paper, several features are of particular 
importance. The first one is autocorrelation, a statistical association between given 
observations in the series and previous observations in the same series. If such an 
association exists, then knowing those previous observations enhances the prediction of 
subsequent observations, and therefore it is an important modeling parameter. The 
second characteristic is seasonality, a cyclical pattern in the data. In addition to the 
aforementioned weather seasons, other examples of seasonality would be annual cycles 
in the levels of carbon dioxide emission in the atmosphere (Cryer & Chan, 2008), or the 
weekly cycles in daily high school attendance rates discussed in this paper. 
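
To make the notion of autocorrelation concrete, the following is a minimal sketch of how the sample autocorrelation at successive lags might be computed. The function name `autocorrelation`, the five-day cycle and the artificial series are assumptions introduced for illustration only; they are not the attendance data or the estimation routines used in this paper.

```python
import numpy as np

def autocorrelation(series, max_lag):
    """Sample autocorrelation of a series for lags 0..max_lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()                      # center the series on its mean
    denom = np.dot(x, x)                  # lag-0 sum of squares
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

# Hypothetical series: a five-day "weekly" cycle plus random noise.
rng = np.random.default_rng(0)            # arbitrary seed
days = np.arange(500)
weekly_cycle = np.sin(2 * np.pi * days / 5)
series = weekly_cycle + rng.normal(scale=0.5, size=days.size)

acf = autocorrelation(series, max_lag=10)
for lag, value in enumerate(acf):
    print(f"lag {lag:2d}: {value:+.2f}")
```

In a series like this one, the autocorrelation is positive at lags that are multiples of the cycle length (here 5 and 10) and negative in between, which is how seasonality shows up in the autocorrelation function.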

Time series analysis is particularly suitable to detect complex behavior, as is the case 
when there is a cyclical fluctuation in behavior over time, in which the timing of the 
cycles is unpredictable (pink noise), or when there are no cycles at all but just highly 
volatile behavior (Brownian motion). This paper describes both of these processes. 
Another well-known example of a more complex time series is chaos, a deterministic 
temporal pattern in which tiny fluctuations early in the series are correlated with much 
larger fluctuations later on (Kaplan & Glass, 1995; Sprott, 2003). 

The notational conventions and formal specification of these models can be found in 
many time series texts, such as Box and Jenkins (1970), and Cryer and Chan (2008). 
Beran (1994) focuses specifically on the use of time series to model long-term processes. 
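
As a brief orientation to those conventions, the four types of trajectories discussed in this paper can be read as special cases of one model. The notation below follows common textbook usage and is offered as a sketch; it is not reproduced from this paper, and the moving average component is omitted for brevity.

```latex
\[
  (1 - \phi B)\,(1 - B)^{d}\,(X_t - \mu) = \varepsilon_t ,
  \qquad \varepsilon_t \sim \text{i.i.d. } N(0, \sigma^2),
\]
where $B$ is the backshift operator, $B X_t = X_{t-1}$, $\phi$ is the
lag-1 autoregression parameter, and $d$ is the differencing parameter, with
\[
  (1 - B)^{d} = \sum_{k=0}^{\infty} \binom{d}{k} (-B)^{k} .
\]
Special cases:
\begin{itemize}
  \item $\phi = 0,\ d = 0$: white noise (ergodic baseline);
  \item $\phi \neq 0,\ d = 0$: short-term correlations (autoregression);
  \item $0 < d < 0.5$: long-term correlations (pink noise in this paper);
  \item $d = 1$: Brownian motion (random walk).
\end{itemize}
```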

Appendix B: Technical Note 

An Autoregressive Fractionally Integrated Moving Average (ARFIMA) method was 
used to generate three of the four simulated trajectories, and for the analysis of the 
attendance data. Brownian motion was simulated with phytools (Revell, 2012). For all 
four simulations, a normally distributed set of 1,500 successive data points was specified 
with a mean of zero and a standard deviation of one. For the short-term model, one 
autoregression parameter was fixed at φ = .70 for lag 1. To generate pink noise, the 
differencing parameter was set to d = .35. To generate Brownian motion, this parameter 
was set at d = 1. See Cryer and Chan (2008) and Stadnitski (2012) for the notational 
conventions. 
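
The simulation settings above can be sketched in a few lines of code. The sketch below is an illustration under stated assumptions and not the software used for this paper: it uses NumPy rather than an ARFIMA package or phytools, it treats the mean of zero and standard deviation of one as properties of the random innovations, it approximates the long-term model with truncated moving-average weights, and the seed and function names are arbitrary.

```python
import numpy as np

N = 1500                                  # series length reported above
rng = np.random.default_rng(2024)         # arbitrary seed (assumption)

def white_noise(n):
    """Independent N(0, 1) draws: the ergodic baseline."""
    return rng.normal(0.0, 1.0, size=n)

def ar1(n, phi=0.70):
    """Short-term correlation: x[t] = phi * x[t-1] + e[t]."""
    e = white_noise(n)
    x = np.empty(n)
    x[0] = e[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + e[t]
    return x

def frac_weights(n, d):
    """First n moving-average weights of (1 - B)^(-d)."""
    k = np.arange(1, n)
    return np.concatenate(([1.0], np.cumprod((k - 1 + d) / k)))

def long_memory(n, d=0.35):
    """Fractionally integrated noise, ARFIMA(0, d, 0), truncated at n weights."""
    e = white_noise(2 * n)                # the first n draws act as burn-in
    return np.convolve(e, frac_weights(n, d))[n:2 * n]

def brownian_motion(n):
    """Random walk (d = 1): the running sum of white noise."""
    return np.cumsum(white_noise(n))

series = {
    "white noise": white_noise(N),
    "autoregression": ar1(N),
    "pink noise": long_memory(N),
    "Brownian motion": brownian_motion(N),
}
```

Plotting the four entries of `series` against time gives trajectories of the kind discussed in the first section, although the exact shapes depend on the seed.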

For the analysis of daily attendance, a competitive model selection strategy 
(Wagenmakers, Farrell & Ratcliff, 2004) was used to compare a baseline assuming 
randomly distributed error patterns (white noise) with a short-term autoregression 
model and a long-term model (pink noise). The long-term model best described the 
attendance patterns in School 2: the observed differencing parameter was d = .13. A 
model with short-term components, but without long-term estimation, best fitted the 
trajectory for School 1. No Brownian motion was observed in either school. See 
Koopmans (2015) for further detail. 
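
The logic of competitive model selection can be sketched as follows. This is a simplified illustration and not the estimation procedure used for the attendance data: it assumes a Whittle (frequency-domain) approximation to the likelihood, a grid search in place of numerical optimization, and the Akaike information criterion (AIC) as the basis for comparison. The function names are hypothetical, and the series analyzed is simulated rather than observed.

```python
import numpy as np

def periodogram(x):
    """Periodogram at the positive Fourier frequencies below pi."""
    x = np.asarray(x, dtype=float)
    n = x.size
    m = (n - 1) // 2
    lam = 2 * np.pi * np.arange(1, m + 1) / n
    I = np.abs(np.fft.fft(x - x.mean())[1:m + 1]) ** 2 / (2 * np.pi * n)
    return lam, I

def whittle_aic(I, g, n_params):
    """AIC (up to a constant) for a spectral shape g, with the scale profiled out."""
    scale = np.mean(I / g)
    neg2loglik = 2 * (I.size * np.log(scale) + np.sum(np.log(g)))
    return neg2loglik + 2 * n_params

def compare_models(x):
    """Return {model: (AIC, estimated parameter)} for the three candidates."""
    lam, I = periodogram(x)
    fits = {"white noise": (whittle_aic(I, np.ones_like(lam), 0), None)}
    # Short-term model: AR(1) spectral shape 1 / |1 - phi e^{-i lam}|^2.
    phis = np.linspace(-0.95, 0.95, 191)
    fits["short-term"] = min(
        (whittle_aic(I, 1 / (1 - 2 * p * np.cos(lam) + p ** 2), 1), p)
        for p in phis)
    # Long-term model: fractional shape |1 - e^{-i lam}|^(-2d).
    ds = np.linspace(0.01, 0.49, 49)
    fits["long-term"] = min(
        (whittle_aic(I, (2 - 2 * np.cos(lam)) ** (-d), 1), d)
        for d in ds)
    return fits

# Simulated short-term series (phi = .70), standing in for real data.
rng = np.random.default_rng(1)            # arbitrary seed
e = rng.normal(size=1500)
x = np.empty(1500)
x[0] = e[0]
for t in range(1, 1500):
    x[t] = 0.70 * x[t - 1] + e[t]

fits = compare_models(x)
best = min(fits, key=lambda name: fits[name][0])
for name, (aic, estimate) in fits.items():
    print(name, round(aic, 1), estimate)
print("preferred model:", best)
```

The model with the lowest AIC is preferred. Applied to real attendance data, the same comparison would be run on each school's series separately.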